Mini-Project #02: Digging Into Affordable Backyards
Author
Nusrat Akter
Published
November 16, 2025
What Is The Mission?
Housing affordability has become one of the most pressing challenges in cities across the United States. In this project, I analyze data to identify which cities are the most “YIMBY” (Yes In My Backyard) and those that embrace growth through flexible zoning and pro-housing policies. Inspired by CityNerd’s work, my goal is to show how encouraging development can make cities more affordable, diverse, and vibrant. Using these findings, I will write a policy brief aimed at convincing congressional representatives to support a federal YIMBY incentive program. A program like this would help reward cities that expand housing opportunities, help lower costs, strengthen communities, and ensure that more Americans can afford to live where they work.
Data Acquisition
Note
We will be using the U.S. Census and ACS data to study housing, wage, and economic trends across Core-Based Statistical Areas (CBSAs). We include data on new housing construction and income from the Bureau of Labor Statistics. Together, these sources reveal patterns in housing growth, wages, and affordability.
Show The Code
if(!dir.exists(file.path("data", "mp02"))){dir.create(file.path("data", "mp02"), showWarnings=FALSE, recursive=TRUE)}library <-function(pkg){## Mask base::library() to automatically install packages if needed## Masking is important here so downlit picks up packages and links## to documentation pkg <-as.character(substitute(pkg))options(repos =c(CRAN ="https://cloud.r-project.org"))if(!require(pkg, character.only=TRUE, quietly=TRUE)) install.packages(pkg)stopifnot(require(pkg, character.only=TRUE, quietly=TRUE))}library(tidyverse)library(glue)library(readxl)library(tidycensus)get_acs_all_years <-function(variable, geography="cbsa",start_year=2009, end_year=2023){ fname <-glue("{variable}_{geography}_{start_year}_{end_year}.csv") fname <-file.path("data", "mp02", fname)if(!file.exists(fname)){ YEARS <-seq(start_year, end_year) YEARS <- YEARS[YEARS !=2020] # Drop 2020 - No survey (covid) ALL_DATA <-map(YEARS, function(yy){ tidycensus::get_acs(geography, variable, year=yy, survey="acs1") |>mutate(year=yy) |>select(-moe, -variable) |>rename(!!variable := estimate) }) |>bind_rows()write_csv(ALL_DATA, fname) }read_csv(fname, show_col_types=FALSE)}# Household income (12 month)INCOME <-get_acs_all_years("B19013_001") |>rename(household_income = B19013_001)# Monthly rentRENT <-get_acs_all_years("B25064_001") |>rename(monthly_rent = B25064_001)# Total populationPOPULATION <-get_acs_all_years("B01003_001") |>rename(population = B01003_001)# Total number of householdsHOUSEHOLDS <-get_acs_all_years("B11001_001") |>rename(households = B11001_001)
When comparing the CBSA’s from 2010-2019, we can see there are several contenders when looking at the table provided. From the years 2010-2019, the CSBA, Houston-Sugar Land-Baytown, TX Metro Area, permitted the largest number of new housing units just in that decade alone. The CBSA code for that particular area is 26420 and permitted a total of 482,075 housing units.
Year In Which Albuquerque, NM Permitted The Most New Housing Units
Based on our data, in 2021, Albuqquerque, NM permitted the most housing units with a total of 4,021 units. Results in this year, however, may be skewed due to the COVID-19 epidemic faced in the year 2020.
State With The Highest Average Individual Income In 2015
Show The Code
library(dplyr)library(stringr)library(tibble)library(DT)library(htmltools)library(scales)extract_state <-function(name) str_match(name, ", ([A-Z]{2})")[, 2]state_df <-tibble(abb =c(state.abb, "DC", "PR"),name =c(state.name, "District of Columbia", "Puerto Rico"))highlight <-function(x) paste0('<span style="color:#0000ff; font-weight:bold;">', x, '</span>')q2_3_tbl <- INCOME %>%filter(year ==2015) %>%left_join(HOUSEHOLDS %>%filter(year ==2015),by =c("GEOID", "NAME", "year")) %>%left_join(POPULATION %>%filter(year ==2015),by =c("GEOID", "NAME", "year")) %>%mutate(state =extract_state(NAME),total_income = household_income * households ) %>%group_by(state) %>%summarise(total_income =sum(total_income, na.rm =TRUE),total_population =sum(population, na.rm =TRUE),avg_individual_income = total_income / total_population,.groups ="drop" ) %>%left_join(state_df, by =c("state"="abb")) %>%filter(!is.na(state)) %>%arrange(desc(avg_individual_income)) %>%transmute(State = name,Abbrev = state,`Avg Individual Income (2015)`= avg_individual_income,Population = total_population,`Total Income`= total_income )# highlighttop_state <- q2_3_tbl$Abbrev[1]q2_3_tbl <- q2_3_tbl %>%mutate(State =ifelse(Abbrev == top_state, highlight(State), State))# export functions with tabledatatable( q2_3_tbl %>%mutate(`Avg Individual Income (2015)`=dollar(`Avg Individual Income (2015)`),`Total Income`=dollar(`Total Income`),Population =comma(Population) ),rownames =FALSE,escape =FALSE, class ="compact stripe hover order-column nowrap",caption = tags$caption(style ="caption-side: top; text-align: left; font-weight:600;","Average Individual Income By State In The Year 2015" ),extensions ="Buttons",options =list(pageLength =10,dom ="Bfrtip",buttons =c("copy", "csv", "excel"),order =list(list(2, "desc")) ))
TipAnswer
The state that had the highest average individual income in 2015 was DC. The average individual income in that year alone was $33,232.88, surpassing states that were close such as Massachusetts (MA) and Connecticut (CT). This illustrates that where you live in the United States greatly affects how much you can earn. This can impact housing costs, schools, and even local services.
Year In Which The NYC CBSA Had The Most Data Scientists In The Country
The year in which NYC CBSA had the most data scientists in the country was 2015. This goes to show how NYC leads in data science employment because of its large tech and finance industries.
Finance and Insurance Claim Largest Share of NYC CBSA Wages in Peak Year
The finance and insurance industry’s share of total NYC wages peaked at 4.6% in 2014. The finance and insurance industry is a pivotal part of NYC’s economy and pays some of the highest wages in the country. This goes to show how dominating the industry is and its role in shaping the city’s economic landscapes.
Initial Visualizations
CBSA Monthly Rent vs. Average Household Income (2009)
Show The Code
library(ggplot2)metro_affordability_2009 <- RENT |>filter(year ==2009) |>inner_join(INCOME |>filter(year ==2009), by =c("GEOID", "year", "NAME"))ggplot(metro_affordability_2009, aes(x = household_income, y = monthly_rent)) +geom_point(color ="#2E86AB", alpha =0.5, size =2) +geom_smooth(method ="lm", color ="#C70039", linewidth =1, se =FALSE) +labs(title ="The Relationship Between Monthly Rent And Average Household Income Per CBSA (2009)",x ="Average Household Income In USD",y ="Median Monthly Rent In USD" ) +scale_x_continuous(labels = scales::dollar_format(), limits =c(0, NA)) +scale_y_continuous(labels = scales::dollar_format(), limits =c(0, NA)) +theme_minimal()
TipWhat Does This Show?
Here we have the relationship between monthly rent and average household income per CSBA in the year 2009 as depicted by the red line shown on the scatter plot. There is a positive correlation between both which in terms tells us that higher-income areas tend to have higher rents. However, income alone doesn’t determine rent prices. Some cities are more affordable than others at the same income level due to housing supply and local rules being instilled.
Relationship Between Total Employment And Healthcare Sector Employment By CBSA Over Time.
Show The Code
library(dplyr)library(ggplot2)library(scales)library(viridis)healthcare_employment_relationship <- WAGES |>group_by(FIPS, YEAR) |>summarise(total_employment =sum(EMPLOYMENT, na.rm =TRUE) /1000, # in thousandshealthcare_employment =sum(EMPLOYMENT[INDUSTRY ==62], na.rm =TRUE) /1000,.groups ="drop" ) |>filter(total_employment >0, !is.na(healthcare_employment))med_share <- healthcare_employment_relationship |>mutate(ratio = healthcare_employment / total_employment) |>summarise(median_ratio =median(ratio, na.rm =TRUE)) |>pull(median_ratio)ggplot(healthcare_employment_relationship, aes(x = total_employment, y = healthcare_employment, color =as.factor(YEAR))) +geom_point(alpha =0.5, size =2) +geom_smooth(method ="lm", color ="#C70039", linewidth =1, se =FALSE) +geom_abline(intercept =0, slope = med_share, linetype ="dashed", color ="gray40") +scale_color_viridis_d(option ="plasma") +labs(title ="Healthcare Employment vs. Total Employment Across CBSAs (2009–2023)",subtitle ="Dashed line shows median healthcare share; red line = linear trend",x ="Total Employment (thousands)",y ="Healthcare Employment (thousands)",color ="Year" ) +theme_minimal(base_size =14) +theme(plot.title =element_text(face ="bold"),plot.subtitle =element_text(size =11, color ="gray40"),legend.position ="bottom",panel.grid.minor =element_blank() )
TipWhat Does This Show?
The plot here shows that bigger cities have more healthcare workers, with the red-line showing this relationship and how it stays steady over time. By doing so, we can interpret that as cities are growing, there is a need for more doctors, nurses, and hospital staff. The gray line shows the average healthcare share and cities above that line have more jobs in that field than the typical. This is due to the fact that they may house bigger facilties such as hospitals and schools.
Changes in Average Household Size by CBSA
Show The Code
library(dplyr)library(ggplot2)library(scales)HOUSEHOLD_SIZE <- POPULATION %>%select(GEOID, NAME, year, population) %>%inner_join(HOUSEHOLDS %>%select(GEOID, year, households), by =c("GEOID", "year")) %>%mutate(avg_household_size = population / households)HOUSEHOLD_SIZE <- HOUSEHOLD_SIZE %>%mutate(highlight =case_when(str_detect(NAME, "New York") ~"NYC",str_detect(NAME, "Los Angeles") ~"Los Angeles",str_detect(NAME, "Chicago") ~"Chicago",TRUE~"Other" ) )ggplot(HOUSEHOLD_SIZE, aes(x = year, y = avg_household_size, group = GEOID)) +geom_smooth(data = HOUSEHOLD_SIZE %>%filter(highlight =="Other"),aes(group = GEOID),method ="loess", se =FALSE, color ="grey80", linewidth =0.8, alpha =0.6) +geom_smooth(data = HOUSEHOLD_SIZE %>%filter(highlight !="Other"),aes(color = highlight),method ="loess", se =TRUE, linewidth =1.3, alpha =0.2, fill ="lightblue") +scale_color_manual(values =c("NYC"="#0ea5b7", "Los Angeles"="#ef4444", "Chicago"="#fbbf24")) +scale_y_continuous("Average Household Size", labels =number_format(accuracy =0.01)) +scale_x_continuous("Year", breaks =seq(2009, 2023, 2)) +labs(title ="Trends in Average Household Size Across CBSAs",subtitle ="Smoothed trends; NYC, LA, and Chicago highlighted",color ="Highlighted CBSAs",caption ="Source: ACS 1-Year Estimates" ) +theme_minimal(base_size =13) +theme(plot.title =element_text(face ="bold", hjust =0.5),plot.subtitle =element_text(hjust =0.5, color ="gray40"),legend.position ="bottom",panel.grid.minor =element_blank() )
TipWhat Does This Show?
From 2009 to 2023, we can see the average number of people per household has changed over time in different metro areas. The colored lines, as shown on the legend, represent the three major metros: Yellow for Chicago, Red for Los Angeles, and Blue for NYC. The grey lines represent all the other cities excluding the big three that was just listed. But what are we looking at exactly though? The data tells us that household sizes are getting smaller or staying the same, meaning that fewer people are living together as opposed to before.
Building Indices Of Housing Affordability And Housing Stock Growth
Rent burden occurs when households spend more than one-third of their income on rent. In this table, in particular, we are looking at Tampa-St. Petersburg-Clearwater, FL rent costs and household income towards rent each year. Higher numbers mean rent takes up more of people’s income. In the year 2023, we can see that the monthly rent in that area was $1,729 with a rent burden index of 62.63. The rent-to-income percentage was 28.5% which is a stark contrast in the year before.
From 2022 to 2023, rent was raised by $252, with the rent-to-income percentage increasing by 2.9%. Values that are above the normal threshold indicate that families may be struggling financially when most of their income is being pushed towards their living arrangements.
Housing costs differ greatly across the country. The ones highlighted in red like Clearlack, CA Metro Area** (where rent consumes **31.2%** of income) show the least affordable cities, and the ones highlighted in blue like **Laconia, NH Metro Area** (where rent is only **r rent_rank_tbl$rent_to_income_pct[11]% of income) show the most affordable. The cities with high costs typically have better jobs but not enough homes as opposed to cities with low costs having cheaper housing or better pay. This comes to show that housing affordability is important and can impact families everywhere.
Housing Growth
Population And Permit
Show The Code
library(dplyr)library(scales)library(DT)# population + permithousing_growth <- POPULATION %>%inner_join(PERMITS, by =c("GEOID"="CBSA", "year"="year")) %>%arrange(NAME, year) %>%group_by(NAME) %>%mutate(pop_5yr_ago =lag(population, 5),pop_growth_5yr = (population - pop_5yr_ago) / pop_5yr_ago,growth_instant = new_housing_units_permitted / population, growth_rate = new_housing_units_permitted / (pop_growth_5yr * pop_5yr_ago) ) %>%ungroup() %>%filter(!is.na(pop_growth_5yr)) %>%mutate(growth_instant_index =rescale(growth_instant, to =c(0, 100)),growth_rate_index =rescale(growth_rate, to =c(0, 100)) )
Note
Our population and permits tables were joined together to help us construct a suitable measure of housing growth.
CBSAs with Highest and Lowest Instantaneous Housing Growth
Housing construction rates differ a lot across cities. The ones highlighted in green like Punta Gorda, FL Metro Area (Growth Index: 63.2) show cities building the most homes per person, and the ones highlighted in red like Wheeling, WV-OH Metro Area (Growth Index: 0.2) show cities building the least. More construction helps keep housing affordable, while less construction can cause shortages and higher prices in cities.
Cities with Highest and Lowest Housing Growth Rates
Highlighted in blue shows cities building homes to match population growth while those highlighted in orange shows cities not building enough. Cities that fall behind are more prone to facing housing shortages and rising costs. The highest-ranked city, which is Springfield, OH, has a growth rate index of 30.4, while the lowest, Mobile, AL is 18.6, showing a wide gap in how cities respond to population increases.
The metro areas highlighted in blue scored the highest on both housing measures while the ones highlighted in orange scored the lowest. Composite scores range from 24.7 to 88. Higher composite scores essentially means that there is better housing construction while lower composite scores means less housing construction.
Visualization
Rent Burden Change Over Time
Show The Code
library(ggplot2)library(dplyr)# early + latest rent burdenrent_change <- rent_burden %>%group_by(NAME) %>%summarise(rent_early =first(rent_burden_index),rent_latest =last(rent_burden_index),rent_diff = rent_latest - rent_early )highlight <- rent_change %>%filter(rent_early >median(rent_early) & rent_diff <0)ggplot(rent_change, aes(x = rent_early, y = rent_latest)) +geom_point(alpha =0.5, color ="#4C72B0") +geom_point(data = highlight, aes(x = rent_early, y = rent_latest), color ="#DD1C77", size =3) +geom_abline(slope =1, intercept =0, linetype ="dashed", color ="#999999") +labs(title ="Early vs Latest Rent Burden",subtitle ="Red points indicate CBSAs with high initial rent and decreasing burden",x ="Rent Burden Index (Start of Study)",y ="Rent Burden Index (Latest Year)" ) +theme_minimal(base_size =14) +theme(plot.title =element_text(face ="bold", size =16),plot.subtitle =element_text(size =12),axis.title =element_text(face ="bold") )
TipWhat Does This Show?
This compares rent burden from the beginning to the end of the study. Cities on the dashed line stayed the same. Cities above the line became less affordable, while cities below became more affordable. The red dots show 164 cities like Aberdeen, WA Micro Area that started with high rent burden (index: 35.3) but improved to 24.2 over time.
This compares rent burden to housing construction in 2023. The red dots show 101 YIMBY cities like Ann Arbor, MI Metro Area that have high rent but are building lots of housing to fix affordability problems. The larger dots in blue represent bigger cities. These metros recognize their housing challenges and are taking action.
Housing For All: The Program That Expands Affordable Homes
Overview
In the United States of America, there are various cities that are currently struggling with housing shortages. With limited housing supply, rent is increased which in turn forces families to relocate farther from their jobs. Essential workers, such as teachers, firefighters, and nurses, are often priced out of the communities they serve.
Housing For All is a federal program designed to encourage local governments to expand housing through YIMBY-oriented policies. What does this program offer?
Helps make zoning and permitting easier
Allows for more apartments to be located near public transits
Reduction of minimum lot sizes
Track housing and job growth in that specific metro area
By tying federal grants to real results, the program helps cities build more housing and support local economic growth.
Who Are Our Target Bill Sponsors?
Primary Sponsor: Representative from Abilene, TX Metro Area
Abilene, TX – rapid employment growth and pro-development policies.
Federal grants aid in city expand housing options, supporting local workers and continuing economic growth.
Co-Sponsor: Representative from San Jose–Sunnyvale–Santa Clara, CA Metro Area
San Jose, CA - high-income city with limited housing supply, increased rent and limited access for essential workers.
Federal incentives aid in increasing housing availability, reduce rent struggles, and improve quality of life for families.
Key Occupations & Benefits
Housing Construction Jobs
Building new housing creates jobs in construction and other skilled trades.
Union groups gain more members and stable employment opportunities.
Teachers, Firefighters, Nurses
High rents in cities like San Jose push essential workers to live far from their jobs, increasing commutes and turnover.
More housing reduces rent. Even a small decrease in rent can save these workers hundreds of dollars per month.
This helps public services run better and keeps communities stable.
Metrics for Success
Rent Burden:The portion of income spent on rent. Keeping it at 30% or less makes housing affordable for families/workers and reduces financial burden.
Growth Metric:The percent change in jobs or housing units over time. Higher percentage in growth shows successful YIMBY policies that have been implemented.
Conclusion
What is The Housing For All Program Really Doing?
Rewarding cities that are already growing, like Abilene
Supporting high-cost cities, like San Jose, to build more housing units
Helping essential workers and construction unions for more opportunists
Using clear, measurable metrics for federal funding and reduce financial hardships
By expanding housing, lowering rent burdens, and supporting local economies, this program makes communities stronger and more livable for everyone.